Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning
Abstract
Self-supervised monocular methods can efficiently learn depth information of weakly textured surfaces or reflective objects. However, the depth accuracy is limited due to the inherent ambiguity in monocular geometric modeling. In contrast, multi-frame depth estimation methods improve the depth accuracy thanks to the success of Multi-View Stereo (MVS), which directly makes use of geometric constraints. Unfortunately, MVS often suffers from texture-less regions, non-Lambertian surfaces, and moving objects, especially in real-world video sequences without known camera motion and depth supervision. Therefore, we propose MOVEDepth, which exploits MOnocular cues and VElocity guidance to improve multi-frame Depth learning. Unlike existing methods that enforce consistency between MVS depth and monocular depth, MOVEDepth boosts multi-frame depth learning by directly addressing the inherent problems of MVS. The key of our approach is to utilize monocular depth as a geometric priority to construct the MVS cost volume, and to adjust the depth candidates of the cost volume under the guidance of predicted camera velocity. We further fuse monocular depth and MVS depth by learning uncertainty in the cost volume, which results in depth estimation that is robust against ambiguity in multi-view geometry. Extensive experiments show that MOVEDepth achieves state-of-the-art performance: compared with Monodepth2 and PackNet, our method relatively improves the depth accuracy by 20% and 19.8% on the KITTI benchmark. MOVEDepth also generalizes to the more challenging DDAD benchmark, relatively outperforming ManyDepth by 7.2%. The code is available at https://github.com/JeffWang987/MOVEDepth.
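The abstract describes two mechanisms that a small example may make more concrete: depth candidates centred on a monocular prior and adjusted by camera velocity, and an uncertainty-weighted fusion of monocular and MVS depth. The NumPy sketch below is an illustration under stated assumptions, not the released MOVEDepth code. The names `depth_candidates` and `fuse_depth`, and the parameters `max_rel_range` and `speed_scale`, are hypothetical. The rule that the search range widens with camera speed is an assumption (the abstract says only that candidates are adjusted under velocity guidance). The normalized softmax entropy used here is a hand-crafted stand-in for the learned uncertainty.

```python
import numpy as np


def depth_candidates(mono_depth, speed, num_bins=8, max_rel_range=0.3, speed_scale=1.0):
    """Sample depth hypotheses around a monocular depth prior.

    mono_depth: (H, W) array of positive depths (the monocular prior).
    speed: non-negative scalar camera speed estimate (hypothetical units).
    Returns an array of shape (num_bins, H, W).
    """
    # ASSUMPTION: a faster camera gives a wider baseline, so a wider
    # relative search range is allowed; at zero speed the range collapses
    # and every candidate equals the monocular prior.
    rel_range = max_rel_range * (1.0 - np.exp(-speed / speed_scale))
    offsets = np.linspace(-rel_range, rel_range, num_bins)       # (D,)
    return mono_depth[None] * (1.0 + offsets[:, None, None])     # (D, H, W)


def fuse_depth(mono_depth, candidates, cost_volume):
    """Blend monocular and MVS depth with a per-pixel uncertainty weight.

    cost_volume: (D, H, W) matching costs, lower means a better match.
    Returns (fused_depth, uncertainty), both of shape (H, W).
    """
    # Numerically stable softmax over the candidate axis.
    logits = -cost_volume
    logits = logits - logits.max(axis=0, keepdims=True)
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)

    # MVS depth as the probability-weighted mean of the candidates.
    mvs_depth = (prob * candidates).sum(axis=0)

    # ASSUMPTION: normalized entropy in [0, 1] replaces the learned
    # uncertainty; a flat cost profile means the matching is ambiguous.
    entropy = -(prob * np.log(prob + 1e-12)).sum(axis=0) / np.log(prob.shape[0])
    uncertainty = np.clip(entropy, 0.0, 1.0)

    # High uncertainty falls back to the monocular prior.
    fused = uncertainty * mono_depth + (1.0 - uncertainty) * mvs_depth
    return fused, uncertainty
```

In this sketch, an ambiguous cost profile (for example in a texture-less region) pushes the fused depth toward the monocular estimate, while a sharply peaked profile lets the MVS estimate dominate.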
Related resources
Self-Supervised Monocular Image Depth Learning and Confidence Estimation
Convolutional Neural Networks (CNNs) need large amounts of data with ground truth annotation, which is a challenging problem that has limited the development and fast deployment of CNNs for many computer vision tasks. We propose a novel framework for depth estimation from monocular images with corresponding confidence in a self-supervised manner. A fully differential patch-based cost function is...
Tight frame approximation for multi-frames and super-frames
This thesis studies a generator for a multi-frame or super-frame generated under the action of a projective unitary representation for countable discrete groups. Examples of such frames are Gabor multi-frames, Gabor super-frames, and frames for shift-invariant subspaces. We show that there exists a unique normalized tight multi-frame (super-frame) generator that has minimum distance from it. Similar problems for dual frames are also posed, and some ...
Fusion of stereo and still monocular depth estimates in a self-supervised learning context
We study how autonomous robots can learn by themselves to improve their depth estimation capability. In particular, we investigate a self-supervised learning setup in which stereo vision depth estimates serve as targets for a convolutional neural network (CNN) that transforms a single still image to a dense depth map. After training, the stereo and mono estimates are fused with a novel fusion m...
Combining Monocular and Stereo Depth Cues
A lot of work has been done extracting depth from image sequences, and relatively less has been done using only single images. Very little has been done merging these together. This paper describes the fusing of depth estimation from two images, with monocular cues. The paper will provide an overview of the stereo algorithm, and the details of fusing the stereo range data with monocular image f...
Depth Estimation Using Monocular and Stereo Cues
Depth estimation in computer vision and robotics is most commonly done via stereo vision (stereopsis), in which images from two cameras are used to triangulate and estimate distances. However, there are also numerous monocular visual cues, such as texture variations and gradients, defocus, color/haze, etc., that have heretofore been little exploited in such systems. Some of these cues apply even...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i3.25368